Dual-tree $k$-means with bounded iteration runtime

Author

  • Ryan R. Curtin
Abstract

k-means is a widely used clustering algorithm, but for k clusters and a dataset of size N, each iteration of Lloyd's algorithm costs O(kN) time. Although there are existing techniques to accelerate single Lloyd iterations, none of these are tailored to the case of large k, which is increasingly common as dataset sizes grow. We propose a dual-tree algorithm that gives exactly the same results as standard k-means; when using cover trees, we use adaptive analysis techniques to bound, under some assumptions, the single-iteration runtime of the algorithm as O(N + k log k). To our knowledge, these are the first sub-O(kN) bounds for exact Lloyd iterations. We then show that this theoretically favorable algorithm performs competitively in practice, especially for large N and k in low dimensions. Further, the algorithm is tree-independent, so any type of tree may be used.
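
For reference, the O(kN) cost mentioned in the abstract comes from the assignment step of a plain Lloyd iteration, which compares every point with every centroid. The following is a minimal sketch of that naive baseline only, not of the dual-tree algorithm the paper proposes. The function names, the distance counter, and the rule that an empty cluster keeps its old centroid are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd_iteration(
    points: List[Point], centroids: List[Point]
) -> Tuple[List[List[float]], List[int], int]:
    """Run one naive Lloyd iteration.

    Returns the updated centroids, the cluster index assigned to each point,
    and the number of distance evaluations performed (always k * N here).
    """
    k = len(centroids)
    dim = len(centroids[0])
    assignments = []
    distance_evaluations = 0

    # Assignment step: compare every point against every centroid.
    for p in points:
        best_index, best_dist = 0, float("inf")
        for j, c in enumerate(centroids):
            d = squared_distance(p, c)
            distance_evaluations += 1
            if d < best_dist:
                best_index, best_dist = j, d
        assignments.append(best_index)

    # Update step: each centroid becomes the mean of its assigned points.
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, assignments):
        counts[j] += 1
        for t in range(dim):
            sums[j][t] += p[t]

    new_centroids = []
    for j in range(k):
        if counts[j] == 0:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(centroids[j]))
        else:
            new_centroids.append([s / counts[j] for s in sums[j]])

    return new_centroids, assignments, distance_evaluations
```

The counter makes the k * N distance evaluations per iteration explicit; this is the quantity that accelerated methods, such as the one described in the abstract, aim to reduce while returning the same assignments.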

Similar resources

Continuous $k$-Frames and their Dual in Hilbert Spaces

The notion of $k$-frames was recently introduced by Găvruţa in Hilbert spaces to study atomic systems with respect to a bounded linear operator. A continuous frame is a family of vectors in a Hilbert space which allows reproductions of arbitrary elements by continuous superpositions. In this manuscript, we construct a continuous $k$-frame, so-called c$k$-frame, along with an atomic system ...

$\varphi$-CONNES MODULE AMENABILITY OF DUAL BANACH ALGEBRAS

In this paper we define $\varphi$-Connes module amenability of a dual Banach algebra $\mathcal{A}$, where $\varphi$ is a bounded $w_{k^*}$-module homomorphism from $\mathcal{A}$ to $\mathcal{A}$. We are mainly concerned with the study of $\varphi$-module normal virtual diagonals. We show that if $S$ is a weakly cancellative inverse semigroup with subsemigroup $E$ of idemp...

Plug-and-play dual-tree algorithm runtime analysis

Numerous machine learning algorithms contain pairwise statistical problems at their core, that is, tasks that require computations over all pairs of input points if implemented naively. Often, tree structures are used to solve these problems efficiently. Dual-tree algorithms can efficiently solve or approximate many of these problems. Using cover trees, rigorous worst-case runtime guarantees hav...

An Algorithm for Multicast Tree Generation in Networks with Asymmetric Links

We formulate the problem of multicast tree generation in asymmetric networks as one of computing a directed Steiner tree of minimal cost. We present a new polynomial-time algorithm that provides for tradeoff selection, using a single parameter K, between the tree cost (Steiner cost) and the runtime efficiency. Using theoretical analysis, we (1) show that it is highly with a performance guarant...

Bounded approximate Connes-amenability of dual Banach algebras

We study the notion of bounded approximate Connes-amenability for dual Banach algebras and characterize algebras of this type in terms of approximate diagonals. We show that bounded approximate Connes-amenability of dual Banach algebras forces them to be unital. For a separable dual Banach algebra, we prove that bounded approximate Connes-amenability implies sequential approximat...

Journal title:
  • CoRR

Volume abs/1601.03754

Publication date 2016